Random Indexing for Searching Large RDF Graphs
Abstract
Querying large RDF spaces with traditional query languages such as SPARQL is challenging because it requires familiarity with the structure of the RDF graph and with the names (URIs) of its classes, properties and relevant individuals. In this paper, we propose a complementary approach based on Vector Space Models (VSM), more concretely Random Indexing (RI) [1], for building a semantic index for a large RDF graph. Traditionally, a semantic index captures the similarity of terms based on their contextual distribution in a large document collection, and the similarity between documents based on the similarities of the terms they contain. By creating a semantic index for an RDF graph, we are able to determine contextual similarities between graph nodes (e.g., URIs and literals) and, based on these, between arbitrary subgraphs. These similarities can be used to find a ranked list of similar URIs/literals for a given input term, which can in turn be used for exploring the repository or enriching SPARQL queries.
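The idea of a semantic index over graph nodes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy triples, the `ex:` names, the dimensionality `DIM`, the sparsity `NONZERO`, and the choice of "the other two terms of the same triple" as a node's context are all assumptions made for illustration. Each term receives a sparse ternary random index vector, each subject and object accumulates the index vectors of the terms it co-occurs with, and nodes are then ranked by cosine similarity.

```python
import math
import random
from collections import defaultdict

DIM = 512     # dimensionality of the reduced space (assumed value)
NONZERO = 8   # non-zero entries per index vector (assumed value)

# Hypothetical toy triples; the ex: names are invented for illustration.
TRIPLES = [
    ("ex:aspirin", "rdf:type", "ex:Drug"),
    ("ex:aspirin", "ex:treats", "ex:Pain"),
    ("ex:ibuprofen", "rdf:type", "ex:Drug"),
    ("ex:ibuprofen", "ex:treats", "ex:Pain"),
    ("ex:paris", "rdf:type", "ex:City"),
    ("ex:paris", "ex:locatedIn", "ex:France"),
]


def index_vector(term):
    """Sparse ternary random vector, seeded by the term so it is reproducible."""
    rng = random.Random(term)
    vec = [0.0] * DIM
    for i, pos in enumerate(rng.sample(range(DIM), NONZERO)):
        vec[pos] = 1.0 if i % 2 == 0 else -1.0
    return vec


def build_index(triples):
    """Context vector of a node: sum of the index vectors of the terms
    that appear with it in a triple (here: the other two positions)."""
    context = defaultdict(lambda: [0.0] * DIM)
    for s, p, o in triples:
        # Only subjects and objects are indexed as nodes in this sketch.
        for node, others in ((s, (p, o)), (o, (p, s))):
            for other in others:
                for i, x in enumerate(index_vector(other)):
                    context[node][i] += x
    return dict(context)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(term, index, k=3):
    """Rank all other indexed nodes by cosine similarity to `term`."""
    scores = [(other, cosine(index[term], vec))
              for other, vec in index.items() if other != term]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


INDEX = build_index(TRIPLES)
```

Calling `most_similar("ex:aspirin", INDEX)` returns a ranked list of `(node, score)` pairs. In this toy graph the two drug nodes share exactly the same contexts, so they end up with the same context vector, while the city node overlaps with them only through `rdf:type`. A ranked list of this kind is what could then be used to suggest terms for a SPARQL query.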
Similar resources
Random Indexing for Finding Similar Nodes within Large RDF Graphs
We propose an approach for searching large RDF graphs, using advanced vector space models, and in particular, Random Indexing (RI). We first generate documents from an RDF Graph, and then index them using RI in order to generate a semantic index, which is then used to find similarities between graph nodes. We have experimented with large RDF graphs in the domain of life sciences and engaged the...
A Tool for Efficiently Processing SPARQL Queries on RDF Quads
We present a tool called RIQ (RDF Indexing on Quads) for efficiently processing SPARQL queries on large RDF datasets containing quads. RIQ’s novel design includes: (a) a vector representation of RDF graphs for efficient indexing, (b) a filtering index for efficiently organizing similar RDF graphs, and (c) a decrease-and-conquer strategy for efficient query processing using the filtering index t...
Applying Random Indexing to Structured Data to Find Contextually Similar Words
Language resources extracted from structured data (e.g. Linked Open Data) have already been used in various scenarios to improve conventional Natural Language Processing techniques. The meanings of words and the relations between them are made more explicit in RDF graphs, in comparison to human-readable text, and hence have a great potential to improve legacy applications. In this paper, we des...
RDF Triple Stores and a Custom SPARQL Front-End for Indexing and Searching (Very) Large Semantic Networks
With growing interest in the creation and search of linguistic annotations that form general graphs (in contrast to formally simpler, rooted trees), there also is an increased need for infrastructures that support the exploration of such representations, for example logical-form meaning representations or semantic dependency graphs. In this work, we lean heavily on semantic technologies and in ...
MPI Realization of High Performance Search for Querying Large RDF Graphs using Statistical Semantics
With billions of triples in the Linked Open Data cloud, which continues to grow exponentially, very challenging tasks begin to emerge related to the exploitation of large-scale reasoning. A considerable amount of work has been done in the area of using Information Retrieval methods to address these problems. However, although applied models work on Web scale, they downgrade the semantics contai...
Publication date: 2010